智能论文笔记

Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm

Marek Gagolewski , Maciej Bartoszuk , Anna Cena

分类：机器学习 | (统计)机器学习

2022-09-13

应用分层聚类算法所需的时间最常由成对差异度量的计算数量主导。对于较大的数据集，这种约束使所有经典链接标准的使用都处于不利地位。但是，众所周知，单个连锁聚类算法对离群值非常敏感，产生高度偏斜的树状图，因此通常不会反映出真正的潜在数据结构 - 除非簇分离良好。为了克服其局限性，我们提出了一个名为Genie的新的分层聚类链接标准。也就是说，我们的算法将两个簇链接在一起，以至于选择的经济不平等度量（例如，gini-或bonferroni index）的群集大小不会大大增加超过给定阈值。提出的基准表明引入的方法具有很高的实际实用性：它通常优于病房或平均链接的聚类质量，同时保持单个连锁的速度。 Genie算法很容易平行，因此可以在多个线程上运行以进一步加快其执行。它的内存开销很小：无需预先计算完整的距离矩阵即可执行计算以获得所需的群集。它可以应用于配备有差异度量的任意空间，例如，在实际矢量，DNA或蛋白质序列，图像，排名，信息图数据等上。有关R。另请参见https://genieclust.gagolewski.com有关新的实施（GenieClust） - 可用于R和Python。

translated by 谷歌翻译

Are Cluster Validity Measures (In)valid?

Marek Gagolewski , Maciej Bartoszuk , Anna Cena

分类： (统计)机器学习 | 机器学习

2022-08-02

内部群集有效性度量（例如Calinski-Harabasz，Dunn或Davies-Bouldin指数）经常用于选择适当数量的分区数量，应将数据集分为二。在本文中，我们考虑如果将这些索引视为无监督学习活动中的客观功能会发生什么。关于轮廓指数的最佳分组是否真的有意义？事实证明，许多群集有效性指数促进了聚类，这些聚类与专家知识相匹配。我们还引入了邓恩指数的一个新的，表现出色的变体，该变体是建立在OWA操作员和接近邻居图的基础上的，因此，无论其形状如何，都可以更好地相互分离。

translated by 谷歌翻译

Genetic-tunneling driven energy optimizer for magnetic system

Qichen Xu , Zhuanglin Shen , Manuel Pereiro , Pawel Herman , Olle Eriksson , Anna Delin

分类：神经与进化计算

2022-12-31

Novel topological spin textures, such as magnetic skyrmions, benefit from their inherent stability, acting as the ground state in several magnetic systems. In the current study of atomic monolayer magnetic materials, reasonable initial guesses are still needed to search for those magnetic patterns. This situation underlines the need to develop a more effective way to identify the ground states. To solve this problem, in this work, we propose a genetic-tunneling-driven variance-controlled optimization approach, which combines a local energy minimizer back-end and a metaheuristic global searching front-end. This algorithm is an effective optimization solution for searching for magnetic ground states at extremely low temperatures and is also robust for finding low-energy degenerated states at finite temperatures. We demonstrate here the success of this method in searching for magnetic ground states of 2D monolayer systems with both artificial and calculated interactions from density functional theory. It is also worth noting that the inherent concurrent property of this algorithm can significantly decrease the execution time. In conclusion, our proposed method builds a useful tool for low-dimensional magnetic system energy optimization.

translated by 谷歌翻译

ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports

Katharina Jeblick , Balthasar Schachtner , Jakob Dexl , Andreas Mittermeier , Anna Theresa Stüber , Johanna Topalis , Tobias Weber , Philipp Wesp , Bastian Sabel , Jens Ricke

分类：自然语言处理 | 机器学习

2022-12-30

The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.

translated by 谷歌翻译

Non-intrusive surrogate modelling using sparse random features with applications in crashworthiness analysis

Maternus Herold , Anna Veselovska , Jonas Jehle , Felix Krahmer

分类：机器学习 | (统计)机器学习

2022-12-30

Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction is described. The method is compared to other methods on synthetic and real data obtained from crashworthiness analyses. The results show a superiority of the here described approach over state of the art surrogate modelling techniques, Polynomial Chaos Expansions and Neural Networks.

translated by 谷歌翻译

TAToo: Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery

Zhaoshuo Li , Hongchao Shu , Ruixing Liang , Anna Goodridge , Manish Sahu , Francis X. Creighton , Russell H. Taylor , Mathias Unberath

分类：计算机视觉 | 人工智能

2022-12-29

Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation. Methods: We present Tracker of Anatomy and Tool (TAToo). TAToo jointly tracks the rigid 3D motion of patient skull and surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints on the object level. Results: We validate TAToo on both simulation data, where ground truth motion is available, as well as on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1{\deg}. We further illustrate how TAToo may be used in a surgical navigation setting. Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo directly predicts the motion from surgical videos, without the need of any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning of our depth network to reach a 1 mm clinical accuracy goal desired for surgical applications in the skull base.

translated by 谷歌翻译

Choosing the Number of Topics in LDA Models -- A Monte Carlo Comparison of Selection Criteria

Victor Bystrov , Viktoriia Naboka , Anna Staszewska-Bystrova , Peter Winker

分类：自然语言处理 | 机器学习 | (统计)机器学习

2022-12-28

Selecting the number of topics in LDA models is considered to be a difficult task, for which alternative approaches have been proposed. The performance of the recently developed singular Bayesian information criterion (sBIC) is evaluated and compared to the performance of alternative model selection criteria. The sBIC is a generalization of the standard BIC that can be implemented to singular statistical models. The comparison is based on Monte Carlo simulations and carried out for several alternative settings, varying with respect to the number of topics, the number of documents and the size of documents in the corpora. Performance is measured using different criteria which take into account the correct number of topics, but also whether the relevant topics from the DGPs are identified. Practical recommendations for LDA model selection in applications are derived.

translated by 谷歌翻译

From Single-Visit to Multi-Visit Image-Based Models: Single-Visit Models are Enough to Predict Obstructive Hydronephrosis

Stanley Bryan Z. Hua , Mandy Rickard , John Weaver , Alice Xiang , Daniel Alvarez , Kyla N. Velear , Kunj Sheth , Gregory E. Tasian , Armando J. Lorenzo , Anna Goldenberg

分类：计算机视觉 | 人工智能

2022-12-27

Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.

translated by 谷歌翻译

Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images

Mohammad Mahdi Behzadi , Mohammad Madani , Hanzhang Wang , Jun Bai , Ankit Bhardwaj , Anna Tarakanova , Harold Yamase , Ga Hie Nam , Sheida Nabavi

分类：计算机视觉

2022-12-25

Prostate cancer is the most common cancer in men worldwide and the second leading cause of cancer death in the United States. One of the prognostic features in prostate cancer is the Gleason grading of histopathology images. The Gleason grade is assigned based on tumor architecture on Hematoxylin and Eosin (H&E) stained whole slide images (WSI) by the pathologists. This process is time-consuming and has known interobserver variability. In the past few years, deep learning algorithms have been used to analyze histopathology images, delivering promising results for grading prostate cancer. However, most of the algorithms rely on the fully annotated datasets which are expensive to generate. In this work, we proposed a novel weakly-supervised algorithm to classify prostate cancer grades. The proposed algorithm consists of three steps: (1) extracting discriminative areas in a histopathology image by employing the Multiple Instance Learning (MIL) algorithm based on Transformers, (2) representing the image by constructing a graph using the discriminative patches, and (3) classifying the image into its Gleason grades by developing a Graph Convolutional Neural Network (GCN) based on the gated attention mechanism. We evaluated our algorithm using publicly available datasets, including TCGAPRAD, PANDA, and Gleason 2019 challenge datasets. We also cross validated the algorithm on an independent dataset. Results show that the proposed model achieved state-of-the-art performance in the Gleason grading task in terms of accuracy, F1 score, and cohen-kappa. The code is available at https://github.com/NabaviLab/Prostate-Cancer.

translated by 谷歌翻译

Automatic stabilization of finite-element simulations using neural networks and hierarchical matrices

Tomasz Sluzalec , Mateusz Dobija , Anna Paszynska , Ignacio Muga , Maciej Paszynski

分类：机器学习

2022-12-24

Petrov-Galerkin formulations with optimal test functions allow for the stabilization of finite element simulations. In particular, given a discrete trial space, the optimal test space induces a numerical scheme delivering the best approximation in terms of a problem-dependent energy norm. This ideal approach has two shortcomings: first, we need to explicitly know the set of optimal test functions; and second, the optimal test functions may have large supports inducing expensive dense linear systems. Nevertheless, parametric families of PDEs are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter, to the matrix of coefficients of optimal test functions (in a basis expansion) associated with that PDE-parameter. Next, as a remedy for the second shortcoming, we use the low-rank approximation to hierarchically compress the (non-square) matrix of coefficients of optimal test functions. In order to accelerate this process, we train a neural network to learn a critical bottleneck of the compression algorithm (for a given set of PDE-parameters). When solving online the resulting (compressed) Petrov-Galerkin formulation, we employ a GMRES iterative solver with inexpensive matrix-vector multiplications thanks to the low-rank features of the compressed matrix. We perform experiments showing that the full online procedure as fast as the original (unstable) Galerkin approach. In other words, we get the stabilization with hierarchical matrices and neural networks practically for free. We illustrate our findings by means of 2D Eriksson-Johnson and Hemholtz model problems.

translated by 谷歌翻译